A new approach to sequence comparison: Normalized sequence
Abstract
The Smith-Waterman algorithm for local sequence alignment is one of the most important techniques in computational molecular biology. This ingenious dynamic programming approach was designed to reveal the highly conserved fragments by discarding poorly conserved initial and terminal segments. However, the existing notion of local similarity has a serious flaw: it does not discard poorly conserved intermediate segments. The Smith-Waterman algorithm finds the local alignment with maximal score but it is unable to find local alignment with maximum degree of similarity (e.g., maximal percent of matches). Moreover, there is still no efficient algorithm that answers the following natural question: do two sequences share a (sufficiently long) fragment with more than 70% of similarity? As a result, the local alignment sometimes produces a mosaic of well-conserved fragments artificially connected by poorly-conserved or even unrelated fragments. This may lead to problems in comparison of long genomic sequences and comparative gene prediction as recently pointed out by Zhang et al. (1999). In this paper we propose a new sequence comparison algorithm (normalized local alignment) that reports the regions with maximum degree of similarity. The algorithm is based on fractional programming and its running time is O(n^2 log n). In practice, normalized local alignment is only 3-5 times slower than the standard Smith-Waterman algorithm.
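The contrast between maximal score and maximal degree of similarity can be illustrated with a small Python sketch. It is not the algorithm from the paper, and several details are assumptions that the abstract does not state: the normalized score is taken to be S / (|I| + |J| + L), where S is the raw score, |I| + |J| is the combined length of the two aligned fragments and L is a positive constant that favours longer alignments; gaps use a linear penalty; scores are integers; and the fractional program is solved with a plain Dinkelbach-style iteration. The function names `best_parametric` and `normalized_local_alignment` are hypothetical. Each iteration costs one O(n^2) dynamic programming pass, and the sketch makes no claim to the O(n^2 log n) bound quoted above.

```python
from fractions import Fraction


def best_parametric(a, b, lam, match=1, mismatch=-1, indel=-1):
    """Return (S, length) of a local alignment maximizing S - lam * length.

    S is the raw alignment score and length = |I| + |J| is the combined
    length of the two aligned fragments. With lam = 0 this reduces to the
    ordinary Smith-Waterman objective (linear gap penalty, assumed here).
    """
    def key(cell):
        return cell[0] - lam * cell[1]

    empty = (0, 0)  # empty alignment: score 0, length 0
    prev = [empty] * (len(b) + 1)
    best = empty
    for i in range(1, len(a) + 1):
        cur = [empty]
        for j in range(1, len(b) + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            diag = (prev[j - 1][0] + s, prev[j - 1][1] + 2)    # consumes 2 chars
            up = (prev[j][0] + indel, prev[j][1] + 1)          # gap in b
            left = (cur[j - 1][0] + indel, cur[j - 1][1] + 1)  # gap in a
            cell = max(empty, diag, up, left, key=key)
            cur.append(cell)
            best = max(best, cell, key=key)
        prev = cur
    return best


def normalized_local_alignment(a, b, L=10, **scores):
    """Dinkelbach-style iteration for max S / (length + L); requires L > 0."""
    lam = Fraction(0)
    while True:
        S, length = best_parametric(a, b, lam, **scores)
        new_lam = Fraction(S, length + L)  # exact ratio of the current optimum
        if new_lam <= lam:                 # no alignment improves the ratio
            return lam, S, length
        lam = new_lam


# Two conserved blocks joined by two mismatching positions.
a = "ACGTACGTAC" + "TT" + "GCA"
b = "ACGTACGTAC" + "AA" + "GCA"
print(best_parametric(a, b, Fraction(0)))       # raw Smith-Waterman optimum
print(normalized_local_alignment(a, b, L=10))   # normalized optimum
```

On this toy input the raw optimum has score 11 because it bridges the two mismatches to pick up the trailing three matches, whereas the normalized optimum keeps only the ten-base exact match (score 10, combined length 20, ratio 1/3). Exact rational arithmetic via `Fraction` is a design choice of the sketch so that the stopping test is not affected by floating-point rounding.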
Similar resources
Improving the Performance of Bayesian Estimation Methods in Estimations of Shift Point and Comparison with MLE Approach
A Bayesian analysis is used to detect a change-point in a sequence of independent random variables from exponential distributions. In this paper, we try to estimate the change point that occurs in any sequence of independent exponential observations. The Bayes estimators are derived for change point, the rate of exponential distribution before shift and the rate of exponential distribution after s...
A fuzzy multi-objective linear programming approach for solving a new multi-objective job shop scheduling with sequence-dependent setup times
This paper presents a new mathematical model for a bi-objective job shop scheduling problem with sequence-dependent setup times that minimizes the weighted mean completion time and the weighted mean tardiness time. For solving this multi-objective model, we develop a fuzzy multi-objective linear programming (FMOLP) model. In this problem, a proposed FMOLP method is applied with respect to the o...
On difference sequence spaces defined by Orlicz functions without convexity
In this paper, we first define spaces of single difference sequences defined by a sequence of Orlicz functions without convexity and investigate their properties. Then we extend this idea to spaces of double sequences and present a new matrix-theoretic approach to the construction of such double sequence spaces.
Detection of New Silent Mutation at 348 bp Position in a CD18 Gene in Holstein Cattle Normal and Heterozygous for Bovine Leukocyte Adhesion Deficiency Syndrome
In India, Holstein and its crosses are being used extensively in breeding programmes and all these breeding bulls are screened for autosomal recessive genes. Blood samples were collected in ethylenediaminetetraacetic acid (EDTA) coated tubes and DNA was isolated by using the phenol-chloroform method. Polymerase chain reaction restriction fragment length polymorphism (PCR-RFLP) was performed by using...
A Multi-objective Immune System for a New Bi-objective Permutation Flowshop Problem with Sequence-dependent Setup Times
We present a new mathematical model for a permutation flowshop scheduling problem with sequence-dependent setup times considering minimization of two objectives, namely makespan and weighted mean total earliness/tardiness. Only small-sized problems with up to 20 jobs can be solved by the proposed integer programming approach. Thus, an effective multi-objective immune system (MOIS) is ...
A New Approach to Detect Congestive Heart Failure Using Symbolic Dynamics Analysis of Electrocardiogram Signal
The aim of this study is to show that the measures derived from Electrocardiogram (ECG) signals often perform better than the same measures obtained from heart rate (HR) signals. A comparison was made to investigate how far the nonlinear symbolic dynamics approach helps to characterize the nonlinear properties of ECG signals and HR signals, and thereby discriminate between normal and cong...